Propositional Term Extraction over Short Text using Word Cohesiveness and Conditional Random Fields with Multi-Level Features
Abstract
Propositional terms in a research abstract (RA) generally convey the most important information, letting readers quickly grasp the contribution of a research article. This paper treats propositional term extraction from RAs as a sequence labeling task using the IOB (Inside, Outside, Beginning) encoding scheme. Conditional random fields (CRFs) are used to initially detect the propositional terms, and the combined association measure (CAM) is then applied to adjust the term boundaries. By combining multi-level features with inner lexical cohesion, the method can extract propositional terms beyond simple NP-based ones. Experimental results show that CRFs significantly increase the recall rate of imperfect boundary term extraction and that the CAM further improves the term boundaries.
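The IOB encoding mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `spans_to_iob` and `iob_to_spans`, the example sentence, and the chosen term spans are all assumptions made for illustration. It only shows how term spans map to per-token B/I/O labels (the form a sequence labeler such as a CRF would predict) and back.

```python
from typing import List, Tuple

def spans_to_iob(n_tokens: int, spans: List[Tuple[int, int]]) -> List[str]:
    """Encode non-overlapping term spans [start, end) as IOB tags."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B"              # first token of a term
        for i in range(start + 1, end):
            tags[i] = "I"              # remaining tokens inside the term
    return tags

def iob_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Decode IOB tags back into [start, end) spans."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:      # a new term closes the previous one
                spans.append((start, i))
            start = i
        elif tag == "I":
            if start is None:          # stray "I": treat it as a term start
                start = i
        else:                          # "O" closes any open term
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:              # term running to the end of the text
        spans.append((start, len(tags)))
    return spans

# Hypothetical example sentence and term spans (illustrative only).
tokens = "conditional random fields are used to detect propositional terms".split()
spans = [(0, 3), (7, 9)]
tags = spans_to_iob(len(tokens), spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

In a pipeline like the one described, the tags would be predicted per token by the CRF, and the decoded spans would then be candidates for boundary adjustment.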
Similar resources
LSTM-CRF for Drug-Named Entity Recognition
Drug-Named Entity Recognition (DNER) for biomedical literature is a fundamental facilitator of Information Extraction. For this reason, the DDIExtraction2011 (DDI2011) and DDIExtraction2013 (DDI2013) challenges each introduced a task aimed at recognizing drug names. State-of-the-art DNER approaches rely heavily on hand-engineered features and domain-specific knowledge which are difficult to col...
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well in the Persian language if they are trained with good features. To get good performanc...
Multi-channel BiLSTM-CRF Model for Emerging Named Entity Recognition in Social Media
In this paper, we present our multichannel neural architecture for recognizing emerging named entities in social media messages, which we applied in the Novel and Emerging Named Entity Recognition shared task at the EMNLP 2017 Workshop on Noisy User-generated Text (W-NUT). We propose a novel approach, which incorporates comprehensive word representations with multichannel information and Conditio...
ExB Medical Text Miner
We present ExB Medical Text Miner, a text mining pipeline for processing biomedical documents. This application employs state-of-the-art Named Entity Recognition, using linguistic features and word embeddings in a fully-connected second-order Conditional Random Field model, as well as a novel two-stage Relation Extraction module that first detects entity-level relations using a Support Vector C...
Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features
This paper proposes to integrate multi-modal features using conditional random fields (CRF) for broadcast news story segmentation. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness, acoustic features involve pause duration, pitch, speaker change and audio event type, and visual fea...